AITopics | relu activation function

On the VC dimension of deep group convolutional neural networks

Neural Information Processing SystemsJun-21-2026, 17:37:31 GMT

Equivariant neural networks outperform traditional deep neural networks on a number of tasks. The theoretical understanding of their generalization properties remains, however, limited. In this paper, we analyze the generalization capabilities of Group Convolutional Neural Networks (GCNNs) with ReLU activation function through the lens of Vapnik-Chervonenkis (VC) dimension theory. By deriving upper and lower bounds, we investigate how the network architecture affects the VC dimension.

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.67)
Europe (0.67)

Genre: Research Report > Experimental Study (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Learn2Mix: Training Neural Networks Using Adaptive Data Integration

Neural Information Processing SystemsJun-16-2026, 13:59:30 GMT

Accelerating model convergence in resource-constrained environments is essential for fast and efficient neural network training. This work presents learn2mix, a new training strategy that adaptively adjusts class proportions within batches, focusing on classes with higher error rates.

artificial intelligence, deep learning, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Genre: Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (0.46)
Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

a922b7121007768f78f770c404415375-Paper-Conference.pdf

Neural Information Processing SystemsApr-27-2026, 01:20:26 GMT

abstraction, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England (0.28)
North America > United States > New York (0.28)

Genre:

Research Report (0.93)
Instructional Material > Course Syllabus & Notes (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Polyhedron Attention Module: Learning Adaptive-order Interactions Anonymous Author(s) Affiliation Address email Appendixes1

Neural Information Processing SystemsApr-25-2026, 15:31:13 GMT

Contents2 ADeriving Eq. 2. 23 BThe hyperplane set generated by the oblique tree is a superset of that created by the4 ReLU-activated plain DNN 35 CProof of Theorem 1 46 DProof of Theorem 2 57 EProof of Theorem 3 68 FProof of Theorem 4 79 GImplementation Detail 810 We consider a L-layer (L 2) ReLU activated plain DNN module f: Rn0 RnL with input12 x Rp. Eq. 2 in the main text can be30 obtained by rewriting P An oblique tree is a binary tree where each node splits the space by a hyperplane rather than by34 thresholding a single feature. The tree starts with the root of the full input space S, and by recursively35 splitting S, the tree grows deeper. For a D-depth (D 3) binary tree, there are 2D 1 1 internal36 nodes and 2D 1 leaf nodes. As shown in Figure 1, each internal and leaf node maintains a sub-space37 representing a polyhedron in S, and each layer of the tree corresponds to a partition of the input38 space into polyhedrons.

activation state, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

258be18e31c8188555c2ff05b4d542c3-Supplemental.pdf

Neural Information Processing SystemsApr-25-2026, 04:02:03 GMT

activation function, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

Supplementary Material: Repulsive Deep Ensembles are Bayesian ANon-identifiable neural networks

Neural Information Processing SystemsApr-24-2026, 23:34:28 GMT

Deep neural networks are parametric models able to learn complex non-linear functions from few training instances and thus can be deployed to solve many tasks. Their overparameterized architecture, characterized by a number of parameters far larger than that of training data points, enables them to retain entire datasets even with random labels [84]. Even more, this overparameterized regime makes neural network approximations of a given function not unique in the sense that multiple configurations of weights might lead to the same function. Indeed, the output of a feed forward neural network given some fixed input remains unchanged under a set of transformations. For instance, certain weight permutations and sign flips in MLPs leave the output unchanged [9].

artificial intelligence, machine learning, neural network, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback